Korean prosody generation and artificial neural networks

Authors

Kyung-Joong Min

Un-Cheon Lim

Abstract

To make the synthetic speech produced by a Korean TTS (Text-To-Speech) system sound more natural, we need to know all the possible prosodic rules of the Korean language. These rules can be extracted from linguistic and phonetic knowledge or by analyzing real speech. In general, all of these rules are integrated into a prosody-generation algorithm in the TTS system. However, such an algorithm cannot cover every prosodic rule of a language and is not perfect, so the quality of the synthesized speech falls short of what we expect. We therefore propose artificial neural networks (ANNs) that can learn the prosodic rules of Korean. A Multi-Layer Perceptron (MLP) trained with the error Back Propagation (BP) algorithm was selected as the ANN for this study. To train and test these ANNs, we built a corpus of meaningful sentences composed from a corpus of phonetically balanced (PB) isolated words. These sentences were read by one male speaker, recorded, and collected as a speech database. We analyzed the recorded speech to extract the prosodic information of each phoneme and made target and test patterns for the ANNs. We found that the ANNs could learn prosody from real speech and generate the prosody of a sentence given to them as input.
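The abstract names an MLP trained with error back-propagation but gives no architecture details, so the following is only a minimal sketch of that general technique, not the authors' network. The feature encoding (`is_vowel`, `is_phrase_initial`, `is_phrase_final`), the two normalized targets (duration and pitch), the hidden-layer size, the learning rate, and every value in `TOY_PATTERNS` are assumptions made up for illustration; none of them come from the paper or its speech database.

```python
import math
import random


def train_mlp(patterns, n_hidden=4, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer MLP with error back-propagation.

    patterns: list of (inputs, targets) pairs, targets scaled to (0, 1).
    Returns (predict, first_epoch_error, last_epoch_error).
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    n_out = len(patterns[0][1])
    # Each weight row carries one extra entry for the bias term.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
           for _ in range(n_hidden)]
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
           for _ in range(n_out)]

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def forward(x):
        xb = list(x) + [1.0]  # inputs plus bias
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
        hb = h + [1.0]        # hidden activations plus bias
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_o]
        return xb, hb, y

    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, t in patterns:
            xb, hb, y = forward(x)
            total += sum((ti - yi) ** 2 for ti, yi in zip(t, y))
            # Output deltas: error times the sigmoid derivative.
            d_o = [(ti - yi) * yi * (1.0 - yi) for ti, yi in zip(t, y)]
            # Hidden deltas: output deltas propagated back through w_o.
            d_h = [hb[j] * (1.0 - hb[j])
                   * sum(d_o[k] * w_o[k][j] for k in range(n_out))
                   for j in range(n_hidden)]
            # Weight updates (online gradient descent, one pattern at a time).
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w_o[k][j] += lr * d_o[k] * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_h[j][i] += lr * d_h[j] * xb[i]
        errors.append(total / len(patterns))

    def predict(x):
        return forward(x)[2]

    return predict, errors[0], errors[-1]


# Hypothetical toy patterns (invented for this sketch, not measured data).
# Inputs:  [is_vowel, is_phrase_initial, is_phrase_final]
# Targets: [normalized duration, normalized pitch]
TOY_PATTERNS = [
    ([0, 0, 0], [0.2, 0.5]),
    ([1, 0, 0], [0.4, 0.5]),
    ([0, 1, 0], [0.3, 0.7]),
    ([1, 1, 0], [0.5, 0.7]),
    ([0, 0, 1], [0.4, 0.3]),
    ([1, 0, 1], [0.8, 0.3]),
]

predict, first_err, last_err = train_mlp(TOY_PATTERNS)
print("mean squared error, first vs. last epoch:", first_err, last_err)
print("predicted [duration, pitch] for a phrase-final vowel:", predict([1, 0, 1]))
```

In a setup like the one the abstract describes, the toy patterns would presumably be replaced by target patterns derived from the per-phoneme analysis of the recorded sentences, with a held-out set of test patterns used to check how well the network generalizes.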


Related resources

Number of output nodes of artificial neural networks for Korean prosody generation

We have been studying artificial neural networks (ANNs) that can learn and generate the prosody of a Korean sentence. To hear more natural synthetic speech generated by a Korean TTS (Text-To-Speech) system, we have to know all the possible prosodic rules of the Korean language and integrate all of these rules into an algorithm. We can get these rules from linguistic and phonetic knowledge or by analyzi...


Prosody generation with a neural network: weighing the importance of input parameters

As an alternative to synthesis-by-rule, the use of neural networks in speech synthesis has been successfully applied to prosody generation, yet it is not known precisely which input parameters are responsible for good results. The approach presented here tries to quantify the contribution of each input parameter. This is done first by comparing the mean errors of networks trained with only one ...


GENERATION OF MULTIPLE SPECTRUM-COMPATIBLE ARTIFICIAL EARTHQUAKE ACCELEGRAMS WITH HARTLEY TRANSFORM AND RBF NEURAL NETWORK

The Hartley transform, a real-valued alternative to the complex Fourier transform, is presented as an efficient tool for the analysis and simulation of earthquake accelerograms. This paper introduces a novel method based on the discrete Hartley transform (DHT) and a radial basis function (RBF) neural network for generating artificial earthquake accelerograms from specific target spectra. Acce...


Duration Control by Asymmetric Causal Retro-Causal Neural Networks

The generation of pleasant prosody parameters is very important for speech synthesis. A prosody generation unit can be seen as a dynamical system. In this paper, sophisticated time-delay recurrent neural network (NN) topologies are presented which can be used for the modeling of dynamical systems. Within the prosody prediction task, left and right context information is known to influence the pre...


Duration Modeling for Ar Synthesi

Duration modeling is a fundamental task of prosody generation for Text To Speech (TTS) systems. The objective of this task is to predict the duration of a speech unit from its phonological representation. Duration modeling has a significant influence on the intelligibility and the naturalness of the synthesized speech. This paper presents a Neural Network (NN) based approach to predict the dura...



Publication date: 2004
